The variable-step L1 scheme preserving a compatible energy law for time-fractional Allen-Cahn equation
In this work, we revisit the adaptive L1 time-stepping scheme for solving the time-fractional Allen-Cahn equation with the Caputo fractional derivative. The implicit L1 scheme is shown to preserve a variational energy dissipation law on arbitrary nonuniform time meshes by using recent discrete analysis tools, namely the discrete orthogonal convolution kernels and the discrete complementary convolution kernels. Discrete embedding techniques and the fractional Grönwall inequality are then applied to establish a norm error estimate on nonuniform time meshes. An adaptive time-stepping strategy driven by the dynamical features of the system is presented to capture the multi-scale behaviors and to improve the computational performance. Comment: 17 pages, 20 figures, 2 tables
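The variable-step L1 discretization the abstract builds on can be sketched directly. Below is a minimal Python sketch (our illustration, not the paper's code) of the L1 approximation of the Caputo derivative of order alpha in (0, 1) on an arbitrary nonuniform mesh; the kernel formula comes from integrating (t_n - s)^(-alpha) exactly over each subinterval against a piecewise-linear interpolant of u.

```python
from math import gamma

def l1_caputo(u_vals, t, alpha):
    """Variable-step L1 approximation of the Caputo derivative of order
    alpha in (0, 1) at each grid point t[n] on a nonuniform mesh.
    u_vals[k] = u(t[k]); returns the approximations for n = 1..N."""
    out = []
    for n in range(1, len(t)):
        s = 0.0
        for k in range(1, n + 1):
            tau_k = t[k] - t[k - 1]
            # kernel a^{(n)}_{n-k}: exact integral of (t_n - s)^{-alpha}
            # over [t_{k-1}, t_k], divided by the local step size
            a = ((t[n] - t[k - 1])**(1 - alpha)
                 - (t[n] - t[k])**(1 - alpha)) / (gamma(2 - alpha) * tau_k)
            s += a * (u_vals[k] - u_vals[k - 1])
        out.append(s)
    return out
```

Because the scheme interpolates u piecewise-linearly, it is exact for linear u: with t_0 = 0 and u(t) = t, the sum telescopes to the Caputo derivative t^(1-alpha)/Γ(2-alpha) on any mesh, which makes a convenient sanity check.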
Re-Examining State Part C Early Intervention Program Coordinators’ Practices through a Positive Lens on Leadership: A Qualitative Secondary Analysis
Part C early intervention is a program administered under the Individuals with Disabilities Education Act (2004) that provides services to eligible infants and toddlers with disabilities and their families. Part C coordinators oversee the program in their states. This article presents an examination of state Part C program coordinators' leadership practices. We conducted a qualitative secondary analysis to explore the practices that Part C program coordinators described using in a prior study on the processes, barriers, and solutions during a systems change. The present study used two new theoretical frameworks – organizational drivers for systems change and a strengths-based orientation – to create a positive lens on leadership through which to view the identified practices. We selected five interview transcripts from five state Part C program coordinators that contained explicit reflections about leadership behaviors in systems as our primary data set. Five categories of leadership practice emerged from a progressive inductive-deductive coding process: meeting practitioners where they are, identifying leaders, establishing consistent procedures, readying professionals, and building relationships. These themes aligned with the organizational drivers of systems change and highlighted the consistent use of a specific type of leadership: facilitative administration. Implications for the study of systems leadership in early intervention are discussed.
Slimmable Networks for Contrastive Self-supervised Learning
Self-supervised learning has made great progress in large-model pre-training but struggles when training small models. Previous solutions to this problem mainly rely on knowledge distillation and involve a two-stage learning procedure: first train a large teacher model, then distill it to improve the generalization ability of small ones. In this work, we present a new one-stage solution for obtaining pre-trained small models without extra teachers: slimmable networks for contrastive self-supervised learning (SlimCLR). A slimmable network contains a full network and several weight-sharing sub-networks, so we can pre-train only once and obtain various networks, including small ones with low computational cost. However, in the self-supervised setting, interference between the weight-sharing networks leads to severe performance degradation. One symptom of this interference is gradient imbalance: a small proportion of the parameters produces dominant gradients during backpropagation, while the main parameters may not be fully optimized. Divergence in the gradient directions of the various networks may also cause interference. To overcome these problems, we make the main parameters produce dominant gradients and provide consistent guidance to the sub-networks via three techniques: slow-start training of sub-networks, online distillation, and loss re-weighting according to model size. In addition, a switchable linear probe layer is applied during linear evaluation to avoid interference between the weight-sharing linear layers. We instantiate SlimCLR with typical contrastive learning frameworks and achieve better performance than previous methods with fewer parameters and FLOPs. Comment: preprint, work in progress
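To make the weight-sharing idea concrete, here is a toy NumPy sketch (our illustration, not the SlimCLR code) of a slimmable linear layer whose sub-networks use the leading slice of the full weight matrix, together with one plausible size-based loss re-weighting; the paper's exact weighting scheme may differ.

```python
import numpy as np

class SlimmableLinear:
    """Toy weight-sharing linear layer: each width ratio r uses the
    leading r-fraction of the full weight matrix, so every sub-network
    shares its parameters with the full network."""
    def __init__(self, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)

    def forward(self, x, width_ratio=1.0):
        d_out = max(1, int(self.W.shape[0] * width_ratio))
        return x @ self.W[:d_out].T  # sub-network = leading output slice

def reweighted_loss(losses, widths):
    """Loss re-weighting by model size: larger (sub-)networks receive
    proportionally larger weights (an assumed, illustrative scheme)."""
    w = np.asarray(widths, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, losses))
```

Because the sub-network is literally a slice of the full network, its output equals the corresponding slice of the full network's output; this shared computation is exactly what causes the gradient interference the abstract describes.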
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
Misalignment between the outputs of a vision-language (VL) model and the task goal hinders its deployment. This issue can worsen when there are distribution shifts between the training and test data. To address the problem, prevailing fully test-time adaptation (TTA) methods bootstrap themselves through entropy minimization. However, minimizing the entropy of the predictions makes the model overfit to its own incorrect output distributions. In this work, we propose TTA with feedback to avoid such overfitting and to align the model with task goals. Specifically, we adopt CLIP as a reward model to provide feedback for VL models during test time on various tasks, including image classification, image-text retrieval, and image captioning. Given a single test sample, the model aims to maximize the CLIP reward through reinforcement learning. We adopt a reward design that uses the average CLIP score of the sampled candidates as the baseline. This design is simple yet surprisingly effective when combined with various task-specific sampling strategies. The entire system is flexible, allowing the reward model to be extended with multiple CLIP models, and a momentum buffer can be used to memorize and leverage the knowledge learned from multiple test samples. Extensive experiments demonstrate that our method significantly improves different VL models after TTA. Comment: preprint, work in progress; project URL: https://github.com/mzhaoshuai/RLC
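The reward design described above, with the average CLIP score of the sampled candidates as the baseline, reduces to a few lines. The sketch below (our illustration; a real system would obtain the rewards from a CLIP model) forms a REINFORCE-style loss in which candidates scoring above the batch mean are reinforced and the rest suppressed.

```python
import numpy as np

def reinforce_loss_with_baseline(log_probs, rewards):
    """REINFORCE-style loss with the average reward of the sampled
    candidates as the baseline: advantage_i = r_i - mean(r).
    Minimizing the returned value performs gradient ascent on
    E[advantage * log pi]."""
    r = np.asarray(rewards, dtype=float)
    advantages = r - r.mean()  # average-score baseline
    return float(-(advantages * np.asarray(log_probs)).sum())
```

Note that when all candidates receive the same reward the advantages vanish and the loss is zero, so the model is only updated when the reward actually discriminates between candidates.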
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?
Scaling language models has revolutionized a wide range of NLP tasks, yet few-shot relation extraction with large language models has received little comprehensive exploration. In this paper, we investigate two principal methodologies, in-context learning and data generation, for few-shot relation extraction with GPT-3.5 through exhaustive experiments. To enhance few-shot performance, we further propose task-related instructions and schema-constrained data generation. We observe that in-context learning can achieve performance on par with previous prompt-learning approaches, and that data generation with the large language model can boost previous solutions to new state-of-the-art few-shot results on four widely studied relation extraction datasets. We hope our work can inspire future research into the capabilities of large language models for few-shot relation extraction. Code is available at https://github.com/zjunlp/DeepKE/tree/main/example/llm. Comment: SustaiNLP Workshop@ACL 202
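An in-context learning setup for relation extraction of the kind this abstract evaluates can be sketched as a prompt builder: a task-related instruction followed by few-shot demonstrations and the query. The field names and the template below are illustrative assumptions, not the paper's exact prompt.

```python
def build_re_prompt(instruction, examples, query):
    """Assemble an in-context relation-extraction prompt: a task
    instruction, few-shot demonstrations with gold relations, then the
    query sentence left for the model to complete."""
    lines = [instruction, ""]
    for ex in examples:
        lines.append(f"Sentence: {ex['sentence']}")
        lines.append(f"Head: {ex['head']}  Tail: {ex['tail']}")
        lines.append(f"Relation: {ex['relation']}")
        lines.append("")
    lines.append(f"Sentence: {query['sentence']}")
    lines.append(f"Head: {query['head']}  Tail: {query['tail']}")
    lines.append("Relation:")  # the model fills in the label
    return "\n".join(lines)
```

Schema-constrained variants would additionally list the admissible relation labels in the instruction so the model's completion is restricted to the dataset's schema.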
Efficient Code Generation from SHIM Models
Programming concurrent systems is substantially more difficult than programming sequential systems, yet most embedded systems need concurrency. We believe this should be addressed through higher-level models of concurrency that eliminate many of the usual challenges, such as nondeterminism arising from races. The SHIM model of computation provides deterministic concurrency, and there already exist ways of implementing it in hardware and software. In this work, we describe how to produce more efficient C code from SHIM systems. We propose two techniques: a largely mechanical one that produces tail-recursive code for simulating concurrency, and a more clever one that statically analyzes the communication pattern of multiple processes to produce code with far less overhead. Experimentally, we find that our tail-recursive technique produces code that runs roughly twice as fast as a baseline, and that our statically scheduled code can run up to twelve times faster.
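The tail-recursive simulation technique can be illustrated in miniature. The paper generates C; the Python sketch below (a trampoline, since Python lacks tail-call elimination, and entirely our illustration rather than the paper's generated code) shows the idea: each process step does its work and returns the next step, so deterministic hand-off between processes requires no threads or scheduler.

```python
def run_trampoline(start):
    """Drive tail-call-style process simulation: each step returns the
    next step (a thunk) or None when all processes have finished."""
    step = start
    while step is not None:
        step = step()

def make_processes(log):
    """Two deterministic processes handing control to each other at
    fixed communication points, as statically scheduled code would."""
    def producer(i=0):
        if i == 3:
            return None          # all work done: stop the trampoline
        log.append(f"produce {i}")
        return lambda: consumer(i)   # 'tail call' into the consumer
    def consumer(i):
        log.append(f"consume {i}")
        return lambda: producer(i + 1)  # hand control back
    return producer
```

In the generated C, each return of a thunk corresponds to a genuine tail call that the compiler turns into a jump, which is why the simulation carries so little overhead compared with thread-based concurrency.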